ABSTRACT


The NASA Seasonal-to-Interannual Prediction Project (NSIPP) has developed an ocean data 
assimilation system to initialize the quasi-isopycnal ocean model used in our experimental 
coupled-model forecast system. Initial tests of the system have focused on the assimilation of 
temperature profiles in an optimal interpolation framework. It is now recognized that correction 
of temperature only often introduces spurious water masses. The resulting density distribution 
can be statically unstable and also have a detrimental impact on the velocity distribution. Several 
simple schemes have been developed to try to correct these deficiencies. Here the salinity field 
is corrected by using a scheme which assumes that the temperature-salinity relationship of the 
model background is preserved during the assimilation. The scheme was first introduced for a
z-level model by Troccoli and Haines (1999). A large set of subsurface observations of salinity
and temperature is used to cross-validate two data assimilation experiments run for the 6-year 
period 1993-1998. In these two experiments only subsurface temperature observations are used, 
but in one case the salinity field is also updated whenever temperature observations are available. 

The effectiveness of the Troccoli and Haines scheme is reflected not only in a better salinity field 
but also in an improved temperature field. The root-mean-square difference (RMSD) between 
the assimilation analyses and observations in the equatorial Pacific shows an average 
improvement in the upper 900m of 20% in the salinity field and of 6% in the temperature field. 
The impact of the subsurface assimilation has also been assessed via data retention experiments
(simulated forecasts). The RMSD diagnostic for these “forecasts” increases only moderately up
to a 6-month lead, showing that information from the assimilation is retained for several months.

1. Introduction


Assimilation of subsurface temperature (T) observations has improved the quality of ocean state
analyses. When these analyses are used to initialize coupled-model predictions, the forecast skill 
also improves (e.g., Ji et al., 1998, and Segschneider et al., 2001). However, it has been shown 
that the univariate assimilation of temperature has deleterious effects on the salinity (S) field and 
hence on the density field (Cooper, 1988, Acero-Schertzer et al., 1997, Ji et al., 2000, Troccoli et 
al., 2002, the latter hereafter referred to as TBS02). 

Some solutions to this problem have already been put forward. The proposed solutions can be 
broadly divided into two main categories: those that rely on direct observations in order to 
construct some historical correction (Vossepoel et al., 1999, Maes and Behringer, 2000, and 
Vossepoel and Behringer, 2000) and those that draw statistical or physical information from the 
model itself [e.g., through an Ensemble Kalman Filter (EnKF: Keppenne and Rienecker, 2002) or 
model statistics, or from local model background water mass properties as in Troccoli and Haines
(1999), hereafter TH99]. Given the scarcity of direct salinity measurements, applications of the
first approach are necessarily limited, as their training period can only refer to restricted time-space
intervals. This restriction is reflected in the limited variability contained in the statistics. This situation
appears unlikely to change for a few years to come, although efforts toward a better salinity observing
network are underway, especially with Argo floats (e.g., http://www.argo.ucsd.edu). On the other
hand, the more complex primitive equation models, such as the HOPE model at the European 
Centre for Medium-Range Weather Forecasts (ECMWF, see TBS02) and the Poseidon model at 
the NASA Seasonal-to-Interannual Prediction Project (NSIPP, see Coles and Rienecker, 2001), 
have proven capable of simulating the water-mass characteristics quite well. This has allowed the 
development of methods belonging to the second category. However, it is one of the objectives of 
this paper to further validate the water-mass characteristics of the Poseidon model using a 
substantial number of subsurface observations. 

In this work, as in TBS02, the focus is on the use of model-derived water-mass properties (T and
S) to correct the model salinity commensurate with the temperature corrections made by
assimilating temperature observations. The salinity increments are calculated according to the 
temperature analysis by preserving the model's local T-S relationships as described in TH99. The 
TH99 (salinity) scheme does not make any use of statistics for the salinity adjustment. Two other
appealing advantages of this approach are its low computational cost and its portability between models.

An initial evaluation of the TH99 scheme, using a z-coordinate ocean model, was presented in 
TBS02. Subsequently, Segschneider et al. (2001) showed that when the TH99 scheme is added to 
their combined temperature and sea surface height assimilation, it improves the ECMWF seasonal 
forecast at lead times greater than 3 months, and up to 6 months. The same scheme is
implemented here in a quasi-isopycnal model to test whether a different model formulation would 
also benefit from the TH99 scheme. The validation of the TH99 scheme, undertaken in TBS02 by 
a comparison with climatologies, is extended here through a comparison with an abundant set of 
independent (i.e., not assimilated) observations and by using different diagnostics. As a further 
step, the retention of information - either improved or degraded structure, depending on whether 
salinity is adjusted or not - is assessed in simulated forecast mode, that is, when the assimilation 
ceases but the prescribed time-dependent forcing is continued. This experiment has the advantage 
over coupled forecast tests in that the shock or drift usually found to dominate initial evolution of
coupled models is not present to confuse the interpretation of results. 

In section 2 the ocean model, the data assimilation system and the experiments are described. 
Comparisons with comprehensive observation analyses are presented in section 3 for the 
assimilation experiments. In section 4, the simulated forecasts are analyzed. Finally, a discussion 
is presented in section 5. 


2. The Model and Assimilation System 
2.1 The ocean model 

The ocean model used in this study is the reduced-gravity quasi-isopycnal Poseidon ocean 
model, Version 4, which uses a generalized vertical coordinate designed to represent a turbulent, 
well-mixed surface layer and nearly isopycnal deeper layers. Coastal topography is represented, 
but the reduced-gravity treatment precludes the use of variable bottom depth. Poseidon has been 
documented and validated in hindcast studies of El Niño (Schopf and Loughe, 1995) and has
since been updated to include prognostic salinity (e.g., Yang et al., 1999). More recently, the
model has been used in an investigation of the annual cycle in the eastern Equatorial Pacific (Yu 
et al., 1997) and in a numerical study of the surface heat balance along the equator (Borovikov et 
al., 2001). 

Poseidon’s prognostic variables are layer thickness, $h(\lambda, \theta, \zeta, t)$, temperature, $T(\lambda, \theta, \zeta, t)$,
salinity, $S(\lambda, \theta, \zeta, t)$, and the zonal and meridional current components, $u(\lambda, \theta, \zeta, t)$ and
$v(\lambda, \theta, \zeta, t)$, where $\lambda$ is longitude, $\theta$ latitude, $t$ time and $\zeta$ is a generalized vertical coordinate
which is 0 at the surface and increments by 1 between successive layer interfaces.

Explicit detail of the model, its vertical coordinate representation and its discretization are 
provided in Schopf and Loughe (1995) and are only summarized here. The equation for mass 
continuity is 


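$$\frac{\partial h}{\partial t} + \nabla\cdot(\mathbf{v}h) + \frac{\partial w_e}{\partial \zeta} = 0, \qquad (2.1.1)$$

where $\nabla\cdot$ and $\mathbf{v}$ are the two-dimensional (2D) divergence operator and velocity vector and $w_e$
represents the volume flux across layer interfaces, including freshwater flux through the surface.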

The heat equation is 


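$$\frac{\partial (hT)}{\partial t} + \nabla\cdot(\mathbf{v}hT) + \frac{\partial (w_e T)}{\partial \zeta} = \frac{\partial}{\partial \zeta}\left(\frac{\kappa}{h}\frac{\partial T}{\partial \zeta}\right) + \frac{\partial Q}{\partial \zeta} + hF_h(T), \qquad (2.1.2)$$

where $Q$ is the external heat flux, $\kappa$ is a heat diffusivity and $F_h$ is a 2D smoothing operator. The
salinity equation is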


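$$\frac{\partial (hS)}{\partial t} + \nabla\cdot(\mathbf{v}hS) + \frac{\partial (w_e S)}{\partial \zeta} = \frac{\partial}{\partial \zeta}\left(\frac{\kappa_s}{h}\frac{\partial S}{\partial \zeta}\right) + hF_h(S), \qquad (2.1.3)$$

where $\kappa_s$ is a salinity diffusivity. The 2D momentum equation is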


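$$\frac{\partial (\mathbf{v}h)}{\partial t} + \nabla\cdot(\mathbf{v}h\mathbf{v}) + \frac{\partial (w_e \mathbf{v})}{\partial \zeta} = -\frac{h}{\rho_0}\nabla p' - bh\nabla z - fh\,\mathbf{k}\times\mathbf{v} + \frac{\partial}{\partial \zeta}\left(\frac{\nu}{h}\frac{\partial \mathbf{v}}{\partial \zeta}\right) + \frac{1}{\rho_0}\frac{\partial \boldsymbol{\tau}}{\partial \zeta} + hF_v(\mathbf{v}), \qquad (2.1.4)$$

where $\nu$ is a vertical friction, $\boldsymbol{\tau}$ is the vertical shear stress, $f\mathbf{k}\times\mathbf{v}$ is the Coriolis acceleration and
$F_v$ is a dissipation term. A hydrostatic Boussinesq approximation is made, so that if $p'(z)$ is the
pressure anomaly at depth $z$, $b$ is buoyancy and $\rho_0$ is the mean density, the hydrostatic equation
then becomes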


Following Pacanowski and Philander (1981), vertical mixing is parameterized through a 
Richardson number-dependent mixing scheme implemented implicitly. An explicit mixed layer 
is included with a mixed layer entrainment parameterization following Niiler and Kraus (1977). 

A time-splitting integration scheme is used whereby the hydrodynamics are done with a short 
time step (15 minutes), but the vertical diffusion, convective adjustment and filtering are done 
with coarser time resolution (half-daily). 

2.2 Model setup 

The version of Poseidon used here has been parallelized as in Konchady et al. (1998) using the 
same message-passing protocol and 2D horizontal domain decomposition used by Schaffer and 
Suarez (2000) for the atmospheric model. Many of the details of the parallel implementation of 
both the model and the assimilation system can be found in Keppenne and Rienecker (2001). 

The model, in a global configuration, is used for NSIPP’s coupled forecast system. However, for 
this study the domain was restricted to the Pacific Ocean, from 45°S to 60°N. The horizontal 
resolution was 1° × 1°, plus an equatorial refinement: the meridional resolution changes
smoothly from 1/3° at the equator to 1° within 10° of the equator (Figure 1). There are 20 layers
in the vertical. At the southern boundary, temperature and salinity are relaxed towards the
Levitus climatology. There are 173 × 164 × 20 grid boxes, of which 28% are situated over land.

The model is forced by daily averaged wind stresses derived from the Special Sensor 
Microwave/Imager (SSM/I) by Atlas et al. (1996). The same wind product, but monthly averaged, 
is used to derive the sensible and latent heat components through the atmospheric mixed layer 
model by Seager et al. (1995). The precipitation is given by the monthly averaged analyses of Xie 
and Arkin (1997). An additional relaxation to climatology is applied to the surface salinity field 
with a time scale of nine months. 

Figure 1. Horizontal domain decomposition for the Pacific model. The thin lines delineate grid cells. The thick
lines correspond to the boundaries of each Processing Element (PE) on a 16 x 16 PE lattice. The dark circles show 
the locations of the TAO moorings. 


2.3 Data Assimilation System 

The data assimilation system in this study is composed of two parts: a univariate optimal 
interpolation system (UOI) to calculate the temperature analysis increments, followed by the 
TH99 scheme to calculate salinity increments consistent with the temperature analysis. The 
temperature analysis is generated according to 


$$[\mathbf{H}\mathbf{P}^f\mathbf{H}^T + \mathbf{W}]\,\mathbf{b} = \mathbf{d} - \mathbf{H}\mathbf{x}^f, \qquad (2.3.1)$$

$$\mathbf{x}^a = \mathbf{x}^f + \mathbf{P}^f\mathbf{H}^T\mathbf{b}. \qquad (2.3.2)$$

In (2.3.1) and (2.3.2), uppercase boldface symbols represent matrices, and lowercase boldface
symbols represent vectors. The vector $\mathbf{d}$ ($n_d \times 1$) contains $n_d$ observations; $\mathbf{x}$ ($n_x \times 1$) is the state
vector. The superscripts $a$ and $f$ refer to the analyzed state and the forecast, respectively. The
matrix $\mathbf{W}$ ($n_d \times n_d$) is the observational error covariance matrix. The representer matrix,
$\mathbf{R} = \mathbf{H}\mathbf{P}^f\mathbf{H}^T$, maps the background (forecast) error covariance matrix, $\mathbf{P}^f$ ($n_x \times n_x$), to the error
subspace of the measurements. The elements of $\mathbf{b}$ are the representer-function amplitudes used
to update $\mathbf{x}$. The $n_d \times 1$ vector, $\mathbf{d} - \mathbf{H}\mathbf{x}^f$, contains the innovations.

In this application, the measurement functional, Hx, is simply a 3D interpolation operator which 
maps the model temperature field, assumed to be at the center of the model layer, to the latitude, 
longitude and depth of each observation. The background-error covariance used in the UOI is 
constant in time. Here the function depends only on the distance between forecast locations, and 
a Gaussian functional form is chosen: 

$$P^f(\Delta\lambda, \Delta\phi, \Delta z) = C\exp\left\{-(\Delta\lambda/L_x)^2 - (\Delta\phi/L_\phi)^2 - (\Delta z/L_z)^2\right\},$$

where $L_x$ defines the zonal decorrelation scale, $L_\phi$ the meridional decorrelation scale and $L_z$ the
vertical decorrelation scale. In this application, $L_x$ = 1800 km, $L_\phi$ = 400 km in the equatorial
waveguide, and $L_z$ = 50 m. The horizontal scales are consistent with those used by Ji et al.
(1995). The value for $L_x$ is modulated meridionally as suggested by Derber and Rosati (1989) to
shorten the covariance scales with latitude.
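
The analysis step and the Gaussian covariance described above can be illustrated with a minimal NumPy sketch. It is not the NSIPP implementation: the function names are hypothetical, separations are treated as Cartesian distances in km and m, the amplitude C is set to 1, the observation errors are assumed uncorrelated with a single variance, and H is passed in as a dense matrix. The meridional modulation of the zonal scale is omitted.

```python
import numpy as np

def gaussian_cov(dx_km, dy_km, dz_m, C=1.0, Lx=1800.0, Ly=400.0, Lz=50.0):
    """Gaussian background-error covariance for given separations."""
    return C * np.exp(-(dx_km / Lx) ** 2 - (dy_km / Ly) ** 2 - (dz_m / Lz) ** 2)

def build_cov(points, **cov_kw):
    """Covariance matrix between all pairs of (x_km, y_km, z_m) points."""
    p = np.asarray(points, dtype=float)
    sep = p[:, None, :] - p[None, :, :]          # pairwise separations
    return gaussian_cov(sep[..., 0], sep[..., 1], sep[..., 2], **cov_kw)

def oi_update(x_f, grid_pts, d, H, obs_var, **cov_kw):
    """Solve [H P H^T + W] b = d - H x_f for b, then return x_f + P H^T b."""
    Pf = build_cov(grid_pts, **cov_kw)
    W = obs_var * np.eye(len(d))                 # assumed diagonal obs-error covariance
    R = H @ Pf @ H.T                             # representer matrix
    b = np.linalg.solve(R + W, d - H @ x_f)      # representer amplitudes
    return x_f + Pf @ H.T @ b
```

With a single observation at a grid point and zero observation error, the analysis matches the observation there and the increment decays with distance according to the Gaussian scales.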

2.4 Salinity increments to preserve water-mass distributions. 

Troccoli and Haines (1999) demonstrated the deleterious effects of assimilating temperature 
profiles without making corresponding adjustments to the salinity profile at each model grid 
point influenced by the temperature assimilation. The problems arise from the introduction of 
gravitationally unstable profiles through the generation of new water masses (e.g., Figure 2). 
They proposed a scheme to preserve the water-mass distribution of the model prior to 
assimilation. The idea stems from the fact that vertical displacements of the water column, 
because of internal wave motion or the passage of mesoscale features, can occur without 
significant changes in the water mass properties. Even for the case of non-monotonic S=S(T), 
two (or more) isothermic parcels can be distinguished according to their depth and the TH99 
scheme recovers salinity from the nearest T(z) in the background field. 

The scheme is described fully in TH99 and TBS02 and is only summarized here. The
procedure is applied at each model grid point in two steps. First, a vertical displacement of the 
model T background profile to match the deepest analyzed T is made. The same displacement is 
applied to the S profile, too. Second, the scheme computes an S increment using the T-S 
relationships from the model T- and S-background profiles and the analyzed T, at each grid point, 
according to the following formulation: 

$$S(z_{an}) = S_{bg}(z_{bg}) \quad \text{if } |z_{an} - z_{bg}| \le \Delta z,$$

$$S(z_{an}) = S_{bg}(z_{an}) \quad \text{if there is no } z_{bg} \text{ such that } T_{bg}(z_{bg}) = T_{an}(z_{an}), \text{ or } |z_{an} - z_{bg}| > \Delta z,$$

where $\Delta z$ is a specified depth tolerance, which could be a function of location. In our case we
chose a fixed $\Delta z$ = 100m. Subscripts an and bg stand for analysis and background, respectively,
and $z_{bg} = z(T_{bg} = T_{an})$ is the depth at which the background temperature is the same as the analysis
temperature. In case of multiple $z(T_{bg} = T_{an})$ solutions, the depth nearest to $z_{an}$ is considered.
Also, as the T-S preservation assumption generally does not hold near the surface, the salinity is
not updated in the surface isothermal layer.
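
The second step of the procedure can be sketched for a single water column as follows. This is an illustrative reading of the rule above, not the code used in the experiments: the function name and arguments are hypothetical, the matching depth is found by linear interpolation between adjacent background levels, the surface isothermal layer is represented by a simple cut-off depth `z_skip`, and the preliminary vertical displacement (the first step) is omitted.

```python
import numpy as np

def th99_salinity(z, T_bg, S_bg, T_an, dz_tol=100.0, z_skip=0.0):
    """Salinity analysis that preserves the background T-S relation in one column.

    z is depth (m, positive downward, increasing with index); all profiles share z.
    """
    z, T_bg, S_bg, T_an = (np.asarray(a, dtype=float) for a in (z, T_bg, S_bg, T_an))
    S_an = S_bg.copy()                      # default: keep background salinity
    for k in range(len(z)):
        if z[k] <= z_skip:                  # no update in the surface layer
            continue
        best = None                         # (matching depth, background S there)
        for j in range(len(z) - 1):
            t0, t1 = T_bg[j], T_bg[j + 1]
            if (t0 - T_an[k]) * (t1 - T_an[k]) > 0.0:
                continue                    # T_an[k] is not bracketed by this segment
            w = 0.0 if t1 == t0 else (T_an[k] - t0) / (t1 - t0)
            z_bg = z[j] + w * (z[j + 1] - z[j])
            s_bg = S_bg[j] + w * (S_bg[j + 1] - S_bg[j])
            if best is None or abs(z_bg - z[k]) < abs(best[0] - z[k]):
                best = (z_bg, s_bg)         # keep the match nearest to z[k]
        if best is not None and abs(best[0] - z[k]) <= dz_tol:
            S_an[k] = best[1]
    return S_an
```

If the analyzed temperature profile is simply the background profile displaced vertically by less than the tolerance, the returned salinity is the background salinity displaced by the same amount, so the T-S relation of the column is unchanged.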

Figure 2. Schematic representation of the two analyses (right) obtained by combining model and XBT water
columns for the monotonic-profile case, following TH99. The shaded layers in the lower analysis profile highlight
unstable water masses introduced by the vertical displacement of the temperature profile without corresponding
displacement of the salinity profile. $\theta$ is potential temperature.


Thus, the TH99 scheme needs only three easily retrievable ingredients at each model grid point:
(1) the analyzed temperature profile, (2) the model temperature profile, and (3) the model salinity
profile. The analyzed temperature profile can be the result of any analysis method, such as optimal
interpolation.

2.5 Incremental analysis


Incremental analysis updating (IAU, e.g., Bloom et al. 1996) is used to insert the analysis
increments, $\mathbf{x}^a - \mathbf{x}^f$, into the model in a gradual manner. Namely, the model partial differential
equations (2.1.1-2.1.4) are replaced with

$$\frac{\partial \mathbf{x}}{\partial t} = \mathbf{F} + \frac{\mathbf{x}^a(t_i) - \mathbf{x}^f(t_i)}{t_{i+1} - t_i}, \qquad t_i < t < t_{i+1}, \qquad (2.5.1)$$

where $\mathbf{F}$ stands for the right hand sides of (2.1.2-2.1.3) and $\mathbf{x}^a(t_i)$ and $\mathbf{x}^f(t_i)$ are the analysis
and forecast at the time, $t_i$, of the $i$th analysis.

Unlike nudging (e.g., Daley 1991), which relaxes the model state toward an analysis, the analysis 
increments are inserted as a state-independent forcing term. The IAU has properties similar to 
those of a low-pass filter and can improve observed-minus-forecast statistics with respect to a 
non-incremental updating scheme (Bloom et al. 1996). 


The IAU is used here for two reasons. First, it lessens the unwanted effects of intermittent data 
assimilation, specifically initialization shocks resulting from imbalances between the model 
fields following the direct insertion of the analysis increments. Second, the IAU allows the 
model to gradually adjust the h field in response to the T and S increments without violating the 
constraints imposed by the continuity equation (2.1.1). 
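
The gradual insertion can be illustrated with a toy time-stepping loop. This is only a sketch under simple assumptions: the state is a small vector, the model tendency is an arbitrary user-supplied function `rhs`, the increment is spread evenly over the window as a constant forcing, and forward Euler is used for brevity. None of these names or choices come from the Poseidon code.

```python
import numpy as np

def integrate_iau(x0, rhs, increment, n_steps, dt):
    """Integrate dx/dt = rhs(x) + increment / window over one analysis window."""
    x = np.array(x0, dtype=float)
    window = n_steps * dt
    forcing = np.asarray(increment, dtype=float) / window   # state-independent forcing
    history = [x.copy()]
    for _ in range(n_steps):
        x = x + dt * (rhs(x) + forcing)                     # forward Euler step
        history.append(x.copy())
    return np.array(history)
```

With a zero model tendency, the state reaches the background plus the full increment at the end of the window, while each individual step changes the state by only a fraction 1/n_steps of the increment. Direct insertion would apply the same total change in a single step.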


2.6 Description of the assimilation experiments. 

The subsurface temperature measurements employed in this study include all the observations 
available in real-time from the Global Telecommunication System, including XBTs (expendable 
BathyThermographs), TAO (Tropical Atmosphere-Ocean) mooring data (e.g., McPhaden et al., 
1998), and profiling floats. The observations were quality controlled at the National Centers for
Environmental Prediction (NCEP). Two experiments have been run to examine the ocean
structure evolution. For the first, only temperature is updated (experiment TOI). For the second,
salinity as well as temperature is updated (experiment TOIS), with the salinity increments given by
the TH99 scheme. For reference, a control run with no data assimilation (experiment CNT) is used
to check how the data assimilation affects systematic model errors. The three runs all use the
ocean model set-up described in section 2.2. The initial conditions were taken from a spun-up
simulation in which climatological forcing was applied for 10 years, followed by daily forcing from 1988
onwards. The experiments in this study were run for the 6-year period 1993-1998. 


3. Comparison with subsurface observations 

Subsurface salinity observations are rather scarce and so it is generally difficult to validate model 
and/or assimilation results in terms of the salinity field. A recent paper by Johnson et al. (2000) 
(hereafter JMRM) provided a thorough data analysis of salinity from Conductivity-Temperature- 
Depth (CTD) observations for the equatorial Pacific for the period from September 1996 to 
November 1998. That dataset has since been extended. This study uses all the CTD observations
available for the period 1993-1998. The transects are located in the zone between 165°E and
95°W and normally extend from 8°S to 8°N (see Figure 3). The temperature and salinity
accuracies are 0.002°C and 0.003, respectively. The JMRM observations were mapped onto a
high resolution grid with 1/5° meridional spacing and 10m vertical spacing. In the following, it 
should be borne in mind that the observed fields are gridded quasi-synoptic one-time sections 
whereas for the model fields we will consider monthly averages. 


Figure 3. Observation locations are denoted by the rectangles whose vertical sides are proportional to the latitudinal 
extent of the transects. The horizontal line inside the rectangle identifies the equator. The crosses indicate a subset of 
the transects occupied in April-September and used for data retention experiments. There are 69 transects in total, of 
which 35 are used for the data retention experiments. 

3.1 The 155°W section 

The physical structures of the analyses from the different experiments have been compared with 
that from JMRM's analysis. Here, the focus is on the meridional section at 155°W. This choice is 
motivated by good temporal and spatial coverage (Figure 3), by location (the central Pacific), and 
by the strength of the salinity signal (there is a pronounced salinity tongue). 


1 Salinity is unitless, as defined by the 1978 Practical Salinity Scale. 

Let us first consider the temperature field, retrieved from the same CTD casts as salinity, so as to
appreciate the modifications introduced by the OI procedure. Figure 4 shows the meridional
transects across the equator at 155°W for JMRM's analysis, TOI, TOIS and CNT (from left to right,
respectively) for the five months presented by JMRM. We will mainly focus on two features: the
vertical temperature gradient and the meridional thermocline gradient.

The thermocline is considerably tighter in the observations than in the control run, CNT, especially
in November 1997. Observed gradients are up to 1.5°C over 10m. In the TOI and TOIS
assimilation runs the thermocline becomes tighter than in CNT, and hence closer to the
observations. Yet, the vertical gradient in TOI and TOIS is not as strong as in the observations.

The difference is mainly due to the fact that monthly averages of the model fields are being 
considered. In fact, other analyses have shown that the vertical displacement of the thermocline on 
even shorter time scales (about 10 days) can be as large as 40m or more, which can translate to a 
temperature variability of about 2.5°C (cf. Figure 2 in TH99). Similar considerations will also 
apply to the salinity field for which the same displacements can lead to a salinity variability of up 
to about 0.2. 

Taking the 18°C isotherm as a reference, it can be seen that the meridional gradient of the 
thermocline is reasonably well simulated by the control run (compare the first and last columns of 
Figure 4). The main contribution of the assimilation runs is in the region of the pronounced 
thermocline ridge near 6-8°N where both meridional and vertical gradients are sharpened by the 
assimilation. Another feature worth noting is the bulge present in the 12°C isotherm in some of 
the TOI transects, such as at 4°S in December 1996 and at the equator in November 1997. We will 
come back to this point later in this section. 

The salinity fields for the same five transects are presented in Figure 5. For the comparisons of 
salinity structure we focus on three regions: the subsurface both south (i.e., the salinity tongue) 
and north of the equator, and the upper water column. The shape of the salinity tongue is well
captured by CNT. Even the excursions of the 35 isohaline across the equator during 1997 (an El
Niño year) are reproduced reasonably well. However, the values are often 0.2-0.4 higher than
observed. North of the equator, the subsurface salinity structure is rather uniform and close to 
observations, although north of about 4°, the observed values are normally underestimated by 
about 0.2 in the model. Near the surface, differences are less systematic. South of the equator, 
CNT is often saltier than observed; however, except for poleward of 4°S in November 1997, 
differences are within 0.2 to 0.4. 

The salinity tongue comparison between TOI and TOIS shows that, although the high salinity 
values are present in both runs, the shape of the tongue is generally too diffuse in the TOI run. In 
comparison, TOIS captures the observed features better. Salty water observed in the lower 
thermocline is associated with eastward advection of salty water in the Equatorial Undercurrent 
(EUC). This contrasts with surrounding westward advection of fresher water in the South 
Equatorial Current (JMRM). Similar features but with a smaller magnitude are also present in 
TOIS and CNT. A striking feature is that in all the panels for the TOI run, a column of salty water 
(S > 34.8) extending a few hundred meters below the thermocline is prominent just south of the 
equator. 

Figure 4: Meridional sections of temperature across the equator at 155°W for Johnson et al. (2000) (JMRM), TOI,
TOIS and CNT for the five months presented by JMRM (cf. JMRM's plate 2). The equator is highlighted in white.
Isolines are thicker for 8, 18 and 28°C.

Figure 5: As in Figure 4 but for salinity. Isolines are thicker for 34, 35 and 36.

Such features are common in cases in which temperature is assimilated with no salinity correction, 
in areas with a marked salinity gradient, as discussed by the stability analysis in TBS02. As a 
result, gravitationally unstable conditions, which cause warm and salty water to be entrained at 
depth, are spuriously introduced in the model. This is an undesirable feature as the spurious water 
mass may persist in the model for a long time. On the other hand, with TOIS, the marked columns 
of salty water seen in TOI have almost entirely disappeared and TOIS is closer in character to the 
observations. 
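
A simple way to flag gravitationally unstable levels in a single T and S profile is sketched below. It is an illustration only: the linear equation of state and its coefficients (`alpha`, `beta`, `rho0`, `T0`, `S0`) are generic textbook-style assumptions, not the equation of state used in the model, and the function names are hypothetical.

```python
import numpy as np

def linear_density(T, S, rho0=1025.0, T0=10.0, S0=35.0, alpha=2.0e-4, beta=7.6e-4):
    """Density (kg m^-3) from an assumed linear equation of state."""
    T = np.asarray(T, dtype=float)
    S = np.asarray(S, dtype=float)
    return rho0 * (1.0 - alpha * (T - T0) + beta * (S - S0))

def unstable_levels(T, S):
    """Indices k where level k is denser than level k+1 directly below it.

    Profiles are ordered from the surface downward.
    """
    rho = linear_density(T, S)
    return np.where(np.diff(rho) < 0.0)[0]
```

For example, a profile whose temperature has been displaced by the assimilation while its salinity has been left in place can be passed to `unstable_levels` to see where denser water now overlies lighter water.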

North of the equator, TOI and TOIS do not differ much from each other and both suffer from a 
fresh bias. This bias, as observed earlier, is a legacy of the control run and therefore it is not 
possible to eliminate it with the adopted assimilation approach. 

It is interesting to note that in TOIS, the near-surface salinity is generally better simulated than in 
both CNT and TOI, even though it is not corrected. This is probably a reflection of the improved 
subsurface salinity structure which may be entrained into the surface layer during mixing. In fact, 
the surface forcings are essentially the same as those in TOI, and differ from CNT solely by the 
evaporation component which, however, does not seem to play a dominant role (compare TOI and 
CNT). Note also that the three runs are all subject to the same sea surface salinity relaxation. Even 
the surface zonal current is reproduced better by TOIS. 

3.2 Cross-validation statistics 

A cross-validation against all available JMRM temperature and salinity observations was 
conducted by calculating the root-mean-square difference (RMSD) for the three runs (Figure 6). 
For each model variable we consider the nearest observation value. The 69 transects available are 
then grouped into two sets: west equatorial Pacific (WEP), for longitudes between 160°E and 
150°W, and east equatorial Pacific (EEP) between 150°W and 90°W. 
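
The RMSD diagnostic can be sketched as follows, assuming the model values have already been matched to the observation locations and arranged as arrays of shape (number of samples, number of depths). The function names are hypothetical, and the way the percentage improvement is averaged here (a depth mean of the relative RMSD reduction) is an assumption made for illustration, not necessarily the averaging used for the figures quoted in this section.

```python
import numpy as np

def rmsd_by_depth(model, obs):
    """Root-mean-square difference at each depth; missing values (NaN) are ignored."""
    diff = np.asarray(model, dtype=float) - np.asarray(obs, dtype=float)
    return np.sqrt(np.nanmean(diff ** 2, axis=0))

def percent_improvement(rmsd_ref, rmsd_new):
    """Depth-averaged relative reduction of RMSD, in percent."""
    rmsd_ref = np.asarray(rmsd_ref, dtype=float)
    rmsd_new = np.asarray(rmsd_new, dtype=float)
    return 100.0 * np.nanmean((rmsd_ref - rmsd_new) / rmsd_ref)
```

Applied to two runs, `percent_improvement(rmsd_by_depth(run_a, obs), rmsd_by_depth(run_b, obs))` is positive when the second run is closer to the observations.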

As expected, the temperature RMSD is notably reduced in the two assimilation runs with respect 
to the CNT run (Figures 6a, b). This reduction is more accentuated in the EEP, where the RMSD 
in CNT is larger, by more than 0.5°C, than in either TOI or TOIS in most of the thermocline, that is 
from 50m to 250m. The reduction is marked even in the WEP but the magnitude is generally 
halved. Some small differences appear also between TOI and TOIS at thermocline depths, with 
TOIS generally closer to observations, especially between 50m and 100m in the EEP where the 
two RMSD differ by up to 0.25°C. More important, however, are the differences between TOI and 
TOIS below 50m. This is especially valid for the EEP since in the WEP the differences are 
marginal. In the EEP, in fact, the differences reach about 0.3°C at 450m. The larger RMSD for 
TOI reflects the worsening of the thermal structure below the thermocline as a consequence of the 
instabilities introduced as discussed above. Therefore, it is a positive feature of the TH99 scheme 
that by changing the salinity field, even the thermal structure is improved. TOIS reduces the
temperature RMSD with respect to CNT by a considerable 17%.

Figure 6: Root-mean-square difference (RMSD) between the three model runs (TOI, TOIS and CNT) and the
observations as a function of depth for the 69 available transects in Figure 3: west equatorial Pacific (a, c) and east
equatorial Pacific (b, d); temperature RMSD (a-b) and salinity RMSD (c-d).


It is very encouraging that the salinity RMSD in TOIS is systematically smaller than in TOI in both 
the EEP and the WEP (Figures 6c, d). Their difference reaches a peak of 0.16 at 50m in the WEP
and 0.1 at 350m in the EEP. Note that below 250m in the EEP, the salinity RMSD trend largely
reflects that for the temperature RMSD, again a consequence of the enhanced mixing which
entrains saltier water at depth (see also section 3.3). The improvement of TOIS compared
to TOI is relatively larger for salinity than for temperature. Considering the two regions, WEP and 
EEP, together, we obtain an average of 6% improvement for temperature and 20% for salinity. 

The salinity RMSD for TOIS is also marginally better than that for CNT, with an average 
improvement of 3%. 

As mentioned above, the current structure is often also improved with TOIS. RMSD statistics
calculated using ADCP observations for the EEP region showed an improvement in TOIS of 10%
and 5% with respect to TOI and CNT, respectively. However, the improvements in zonal velocity
are generally less than the uncertainty (5 cm s$^{-1}$) in the observations.

3.3 Temperature-Salinity Relations 

The detrimental impact of the TOI on the water-mass distribution is apparent in a comparison of 
T-S pairs for the observations, TOI and TOIS. Since the T-S characteristics differ between north 
and south of the equator, we further divide the two regions, EEP and WEP, considered above. We 
therefore account for four regions as presented in Figure 7. 

The differences in T-S distributions between regions are clear from Figure 7. The T-S relation in the
north WEP is generally tighter than in the other regions, with very little variation in salinity, but 
with a slight subsurface saline maximum associated with North Pacific Tropical Water (NPTW) 
and South Pacific Tropical Water (SPTW) which penetrates across the equator (see Johnson and 
McPhaden, 1999). The subsurface salinity maximum of the SPTW is a significant feature in the 
south WEP. In the south EEP, although the water is fresher at depth, the subsurface salinity 
maximum is less apparent because the surface waters are more saline in this region outside the 
South Pacific and Intertropical Convergence Zones. Hence, even in this narrow region, the TH99 
scheme is tested in different T-S regimes. 

Figure 7 is produced by considering the T-S pairs at each observation location and the model 
values interpolated to the observation locations. The T-S pairs are accumulated on a T-S grid 
whose granularity is 0.25°C by 0.1. Where at least one T-S pair exists, a colored circle is plotted. 
This implies that each T-S grid point is given the same weight. However, by neglecting T-S grid 
points with a low occupancy number, we checked that the features shown in Figure 7 are robust. 
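
The accumulation of T-S pairs on a regular grid, and the comparison of which cells are occupied by each dataset, can be sketched as below. The bin edges, the bitmask encoding and the function names are illustrative assumptions. The `min_count` argument mimics the robustness check of neglecting cells with a low occupancy number.

```python
import numpy as np

# Assumed bin edges: 0.25 degC in temperature, 0.1 in salinity.
T_EDGES = np.linspace(0.0, 32.0, 129)
S_EDGES = np.linspace(32.0, 37.0, 51)

def ts_occupancy(T, S, t_edges=T_EDGES, s_edges=S_EDGES):
    """Number of T-S pairs falling in each cell of the T-S grid."""
    counts, _, _ = np.histogram2d(np.ravel(T), np.ravel(S), bins=[t_edges, s_edges])
    return counts

def classify(occ_obs, occ_a, occ_b, min_count=1):
    """Bitmask per cell: 1 = observations, 2 = first run, 4 = second run."""
    return ((occ_obs >= min_count) * 1
            + (occ_a >= min_count) * 2
            + (occ_b >= min_count) * 4)
```

A cell with value 7 is occupied by all three datasets, while a value of 2 or 4 marks T-S properties that appear in only one of the runs and not in the observations.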

The red color signifies that the T-S characteristics of all three ocean representations, TOI, TOIS and 
the observations, coincide. Manifestly, red is the predominant color in all four regions, implying 
that the two assimilation runs produce a good representation of the observed ocean. However, 
other features are also apparent. North of the equator (Figures 7a, b), the assimilation fields (as 
well as the control run, not shown) have fresher values than observations as seen from the 
abundance of violet circles for low salinities and, concurrently, from the cyan circles on the higher 
salinity range. These values mostly relate to fresher surface waters, often mixed into the upper 
thermocline (see the sections at 155°W for example) and result from deficiencies in the surface 
freshwater flux. In the north EEP, the fresher north Pacific influence in the model is also apparent. 
The few cyan circles in the figures for the northern hemisphere also indicate model deficiencies: 
in the penetration of NPTW equatorward in the WEP, and in the cross-equatorial penetration of 
SPTW in both WEP and EEP. There are very few black circles to indicate problems in the North 
Equatorial Pacific. In this region, variations in salinity in the equatorial waveguide are mainly in 
the surface layer, so there is little that the TH99 scheme can (or has to) do to rectify the model's 
salinity as the temperature is changed through assimilation. 

Figure 7: Temperature-Salinity diagrams for TOI, TOIS and observations (OBS) for the four regions defined on each 
panel. The color convention is explained in panel c. For instance, the red color means that both TOI and TOIS agree 
with observations. Superimposed dotted lines are the sigma-theta isolines. 

South of the equator the representation of T-S in the TOIS model is quite good (Figures 7c, d). On 
the other hand, a significant volume of cool, saline water is readily apparent in the TOI fields, as 
shown by the numerous black circles. These circles show that the problems associated with the 
univariate assimilation of temperature with no salinity correction are most significant in the 
southeastern Tropical Pacific. However, in the western basin (north and south) black circles are 
also noticeable on the fresh side of the T-S pairs. 


4. Data Retention 

The obvious question arises as to whether the improved states from TOIS have any impact on 
seasonal forecast skill. As an initial attempt at answering this question, we have conducted forced 
ocean experiments, to be regarded as simulated forecast experiments, but with “perfect” surface 
forcings. These experiments have the advantage of avoiding the problem of initialization shocks 
and drifts of coupled ocean-atmosphere forecasts. The assimilation fields are used as initial 
conditions for the simulation, but after initialization no further assimilation has been undertaken. 
The experiments thus measure the retention of the information provided by assimilation and 
whether the undesirable features of the TOI states impact the subsequent ocean evolution or are 
ameliorated by ocean forcing and mixing. The error statistics from these simulations are compared 
with those from a pure simulation mode (the control) and from the continued assimilation case. 
These simulated forecast experiments were initialized on April 1 of each of the six years 1993 to 
1998 and, in each case, integrated for six months. 

We analyze the results of these forecasts by means of two diagnostics. The first uses RMSD as in 
Figure 6, whereas the second considers the RMSD for two specific subsurface variables as a 
function of lead time. 
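
As an illustration of the two diagnostics, the following minimal Python sketch computes an RMSD either over all model-observation pairs or separately for each value of a grouping label, such as a depth level or a lead month. The function names are hypothetical, and the handling of missing values (NaN pairs are skipped) is an assumption rather than a description of the processing actually used.

```python
import numpy as np


def rmsd(model, obs):
    """Root-mean-square difference between model and observations, skipping NaN pairs."""
    d = np.asarray(model, float) - np.asarray(obs, float)
    return float(np.sqrt(np.nanmean(d ** 2)))


def rmsd_by_group(model, obs, group):
    """RMSD computed separately for each group label (e.g. depth level or lead month)."""
    model, obs, group = (np.asarray(a) for a in (model, obs, group))
    return {g.item(): rmsd(model[group == g], obs[group == g])
            for g in np.unique(group)}


# Hypothetical values: four model-observation pairs at lead months 1 and 2.
by_lead = rmsd_by_group([1.0, 2.0, 3.0, 4.0], [0.0, 2.0, 3.0, 2.0], [1, 1, 2, 2])
```

Grouping by depth gives a profile of the kind shown in Figure 8, while grouping by lead month gives a time series of the kind shown in Figure 9.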

The RMSD diagnostic as a function of depth is shown in Figure 8 and now includes only the 35 
transects that fall within the 6-month forecast interval, April to September (crossed transects in 
Figure 3). All the months have equal weighting in the RMSD calculation. The simulated forecasts 
initialized by TOI (TOIS) analyses are labeled f_TOI (f_TOIS). The overall results are hardly 
distinguishable from those in Figure 6, indicating that the information in the initial conditions is 
retained for at least 6 months. Only at thermocline depths in the EEP is there a notable increase in 
the RMSD, by about 0.2°C for both f_TOI and f_TOIS compared to TOI and TOIS, respectively 
(Figure 8b). Encouragingly, the f_TOIS salinity RMSD almost overlaps that for TOIS, implying 
that the subsurface information is retained on the timescales considered. The same holds true for 
the f_TOI-TOI pair but, as for the assimilation runs, the absolute RMSD values are markedly 
larger than for the f_TOIS-TOIS pair. 

To investigate the evolution of the errors with time, we analyzed the RMSD of the depth of the 
19°C isotherm (Figures 9a, b) and of the salinity on the 25 sigma-theta surface (Figures 9c, 
d). These variables were chosen because, unlike the 20°C isotherm and the 24.5 sigma-theta 
surface representing the equatorial thermocline, they do not outcrop in the 35 transects. Over the 
six lead months, the 35 transects are distributed as follows: 5, 6, 2, 4, 7 and 11. Without 
further assimilation, the states from f_TOI, f_TOIS, and CNT would eventually evolve to 
be quite similar (within predictability limits). 

Figure 8: As in Figure 6 but for the two assimilation runs, TOI and TOIS, and the two simulated forecasts, f_TOI and 
f_TOIS, and for the 35 crossed transects (see Figure 3). 


However, Figure 9 clearly shows that the improved initial conditions provided by both 
assimilations are retained for at least six months. The TOI states produce a better simulation than 
the control, except for the salinity north of the equator. The improved state resulting from TOIS is 
also generally retained throughout the forecast period, so we conclude that the correction of 
subsurface salinity should have a positive impact on seasonal forecast skill. South of the equator, 
the improvements in the state estimate are retained for six months and generally the salinity errors 
introduced by TOI seem to be ameliorated somewhat once the assimilation is stopped. North of 
the equator, however, the errors in salinity introduced by the assimilation persist through the 
six-month period. This indicates that direct salinity observations north of the equator would be 
beneficial. 

Figure 9: Temporal evolution of the composite RMSD of the 19°C isotherm depth (a-b) and of the salinity on the 25 
sigma-theta surface (c-d) for all lead times for the control run (CNT), the two assimilation runs (TOI and TOIS), and the 
simulated forecasts (f_TOI and f_TOIS). The RMSD calculation is split into south (a, c) and north (b, d) of the equator. 


5. Summary and discussion 

The general applicability of the salinity correction scheme documented in TH99 and TBS02 is 
tested here in an isopycnal model rather than a z-coordinate model. The scheme is validated more 
stringently by comparison with a large volume of time-varying temperature and salinity 
observations collected along 69 transects across the equatorial Pacific over the 6-year period 
1993-1998 (Johnson et al., 2000, plus subsequent extension). This scheme has been implemented 
in the data assimilation system employed for NSIPP's routine forecasts. 

Even within the equatorial waveguide, the TH99 scheme is tested in different T-S configurations, 
with variations in salinity structure across the equator and along the equator as seen in Johnson and 
McPhaden (1999). The root-mean-square differences between the assimilation estimates of the 
salinity and temperature fields and these observations offer an encouraging assessment of the 
salinity-correction scheme when compared to a conventional OI procedure that does not update 
salinity. For instance, the salinity RMSD in the upper 900 m is on average 20% lower when the 
TH99 scheme is used than when only the temperature field is updated. We also evaluated the 
impact of subsurface initialization on simulated forecasts. We 
showed that both the subsurface salinity and temperature benefit from the TH99 scheme for all 
lead times considered (up to 6 months) in the south equatorial Pacific. The north equatorial Pacific 
has proven to be a more difficult region to improve, given the presence of systematic errors in the 
model. It is perhaps this region that could most benefit from direct salinity observations. 

Whether the TH99 salinity scheme performs better than other approaches such as multivariate OI, 
based on observations (e.g., Maes and Behringer, 2000) or model-based covariances, or the 
Ensemble Kalman Filter is a matter that will be assessed in the near future. It is apparent though 
that, given its simplicity, the TH99 scheme is effective and easily implemented in any ocean 
model. Also, it does not need an observational or model climatology, which could limit the range 
of variability. However, the TH99 scheme relies strongly on the model dynamics to get a good salinity 
field reconstruction and therefore it can only be applied to ocean models that simulate the 
distribution of water masses reasonably well. It has been shown in this study, as well as in TBS02, 
that the two primitive equation models in question, Poseidon and HOPE, satisfy this requirement 
well (although improvements are needed in the northern equatorial Pacific). It might also be 
argued that the success of such a simple scheme lies in its aim of preserving the density balances - 
through the conservation of the isopycnal layering - present in the model prior to assimilation. It 
appears that initializing the model with a balanced density field is crucial in a sequential data 
assimilation framework. The TH99 scheme could therefore be employed by even more complex 
assimilation systems for initialization purposes. 
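
The principle behind the scheme, that the T-S relationship of the model background is preserved when temperature is updated, can be illustrated with a minimal single-column Python sketch. This is not the TH99 or NSIPP implementation: the function name is hypothetical, the background temperature is assumed to decrease strictly with depth so that S(T) is single-valued, and no special treatment of the surface layer or of the isopycnal layering is attempted.

```python
import numpy as np


def ts_preserving_salinity(t_bkg, s_bkg, t_ana):
    """Assign salinity to analysed temperatures from the background column's S(T) curve.

    t_bkg, s_bkg: background temperature and salinity, ordered from surface to depth.
    t_ana: analysed temperature at the same levels.
    """
    t_bkg, s_bkg, t_ana = (np.asarray(a, float) for a in (t_bkg, s_bkg, t_ana))
    if not np.all(np.diff(t_bkg) < 0):
        raise ValueError("this sketch assumes temperature strictly decreasing with depth")
    # np.interp needs increasing x, so reverse the column; analysed temperatures
    # outside the background range take the nearest end-point salinity.
    return np.interp(t_ana, t_bkg[::-1], s_bkg[::-1])


# Hypothetical column: the analysis cools the two middle levels.
s_ana = ts_preserving_salinity(
    t_bkg=[28.0, 24.0, 18.0, 12.0],
    s_bkg=[34.5, 35.0, 35.2, 34.8],
    t_ana=[28.0, 21.0, 15.0, 12.0],
)
# s_ana is approximately [34.5, 35.1, 35.0, 34.8]
```

In this example each analysed temperature receives the salinity that the background column associates with that temperature, so the updated column contains no water mass that was absent from the background.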